Abstract —In this paper, we highlight the class of spatially
coupled codes and discuss their applicability to long-haul and 
submarine optical communication systems. We first demonstrate 
how to optimize irregular spatially coupled LDPC codes for their 
use in optical communications with limited decoding hardware 
complexity and then present simulation results with an FPGA- 
based decoder where we show that very low error rates can 
be achieved and that conventional block-based LDPC codes can 
be outperformed. In the second part of the paper, we focus on 
the combination of spatially coupled LDPC codes with different 
demodulators and detectors, important for future systems with 
adaptive modulation and for varying channel characteristics. 
We demonstrate that SC codes can be employed as universal, 
channel-agnostic coding schemes. 

Index Terms —Error correction codes, low-density parity-check
codes, spatial coupling, optical communications.

I. Introduction 

Modern high-speed optical communication systems require
high-performing Forward Error Correction (FEC) implementations
that support throughputs of 100 Gbit/s or multiples
thereof, that have low power consumption, that realize Net
Coding Gains (NCGs) close to the theoretical limits at a target
Bit Error Rate (BER) of $10^{-15}$, and that are preferably adapted
to the peculiarities of the optical channel.

Especially with the advent of coherent transmission schemes
and the utilization of high resolution Analog-to-Digital
Converters (ADCs), soft-decision decoding has become an attractive
means of reliably increasing the transmission reach of
lightwave systems. Currently, there are two popular classes of
codes for soft-decision decoding that are attractive for
implementation in optical receivers at decoding throughputs of 100
Gbit/s and above: Low-Density Parity-Check (LDPC) codes
and Block Turbo Codes (BTCs). The latter can be decoded
with a highly parallelizable, rapidly converging soft-decision
decoding algorithm and usually have a large minimum distance,
but require large block lengths of more than 100,000 bits
to realize codes with small overheads, leading to decoding
latencies that can be detrimental in certain applications. With
overheads of more than 15% to 20%, these codes no longer
perform well, at least under hard-decision decoding [1]. LDPC
codes are well understood and are suited to realize codes with
lengths of a few 10,000 bits and overheads above 15%.

Recently, the class of Spatially Coupled (SC) codes
has gained widespread interest due to the fact that these
codes are asymptotically capacity-achieving, have appealing
encoding and decoding complexity and show outstanding
practical decoding performance. SC codes are an extension
of existing coding schemes by a superimposed convolutional
structure. The technique of spatial coupling can be applied
to most existing codes; the most popular, however, are LDPC
codes and BTCs, which have found use in optical
communications (staircase codes) and show outstanding
performance, operating within 0.5 dB of the capacity of the
hard-decision AWGN channel.

In this paper, we discuss the use of SC codes in optical
communications and especially focus on SC-LDPC codes.
We summarize some recent advances and design guidelines
for SC-LDPC codes and show by means of a
Field-Programmable Gate Array (FPGA)-based decoding platform
that large gains at low bit error rates can be realized with
relatively small codes when compared with state-of-the-art
LDPC codes. The aim of this paper is to show that SC-LDPC
codes are mature channel codes that are viable candidates for
future optical communication systems with large NCGs.
Furthermore, their universality makes them attractive for flexible
transceivers with adaptive modulation.

II. LDPC & Spatially Coupled LDPC Codes 

An LDPC code is defined by the null space of a sparse
parity-check matrix $H$ of size $\dim H = M \times N$, where the
code contains all binary code words $x$ of length $N$ such that
$Hx^T = 0$, i.e., $\mathcal{C}_{\text{LDPC}} = \{x \in \{0,1\}^N : Hx^T = 0\}$.

Each row of H is considered to be a check node, while
each column of H is usually termed variable node. We say
that the variable degree (or variable node degree) of a code is
regular with degree $d_v$ if the number of "1"s in each column
is constant and amounts to $d_v$. We say that the check degree
(or check node degree) of a code is regular with degree $d_c$ if
the number of "1"s in each row of H is constant and amounts
to $d_c$. The class of irregular LDPC codes has the property that
the number of "1"s in each column and/or row is not constant.
The degree profile of an irregular LDPC code indicates the
fraction of columns/rows of a certain degree. More precisely,
$a_{v,i}$ represents the fraction of columns with $i$ "1"s (e.g., if
$a_{v,3} = 1/2$, half the columns of H have three "1"s). Note that
$\sum_i a_{v,i} = 1$ has to hold. Similarly, $a_{c,i}$ represents the fraction
of rows (i.e., checks) with $i$ "1"s.
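
As an illustration of the definitions above, the following Python sketch computes the fraction of columns and rows of each degree for a small toy matrix and checks the code word condition over GF(2). The helper names `degree_profile` and `is_codeword` and the matrix `H_toy` are hypothetical and chosen only for this example; the matrix is not one of the codes studied in this paper.

```python
from collections import Counter
from fractions import Fraction


def degree_profile(H):
    """Return (a_v, a_c): fraction of columns / rows of H per degree."""
    num_rows, num_cols = len(H), len(H[0])
    col_degrees = [sum(H[r][c] for r in range(num_rows)) for c in range(num_cols)]
    row_degrees = [sum(row) for row in H]
    a_v = {d: Fraction(k, num_cols) for d, k in Counter(col_degrees).items()}
    a_c = {d: Fraction(k, num_rows) for d, k in Counter(row_degrees).items()}
    return a_v, a_c


def is_codeword(H, x):
    """Check the parity-check condition H x^T = 0 over GF(2)."""
    return all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)


# Toy 3 x 6 matrix (illustrative only; far too small and dense to be a
# useful LDPC code).
H_toy = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 1],
]

a_v, a_c = degree_profile(H_toy)
```

Both dictionaries sum to one by construction, which mirrors the normalization condition on the degree profile.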

LDPC codes form an important class of codes in optical
communications. LDPC codes with soft-decision decoding
are currently being deployed in systems operating at 100 Gbit/s
and, e.g., utilizing 16 iterations. Modern high-performance
FEC systems in optical communications are sometimes
constructed using a soft-decision LDPC inner code which reduces
the BER to a level of 10“^ to 10“® and a hard-decision 
algebraic outer cleanup code which pushes the system BER 
to levels below 10“^^ |j^. The outer cleanup code is used to 
combat the error floor that is present in most LDPC codes. 
Note that the implementation of a coding system with an outer 
cleanup code requires a thorough understanding of the LDPC 
code and a properly designed interleaver between the LDPC 
and the outer code. Recently, there has been some interest 
to avoid the use of an outer cleanup code and to use only 
soft-decision LDPC codes with very low error floors, leading 
to coding schemes with less rate loss and less latency. With 
increasing computational resources, it is now also feasible to 
evaluate very low target BERs of LDPC codes and optimize 
the codes to have very low error floors below the system’s 
target BER. Although the internal data flow of an LDPC
decoder may be larger by more than an order of magnitude [8]
than that of a BTC, several techniques can be used to lower
the data flow, e.g., the use of layered decoding and
min-sum decoding, requiring only two $q$-ary, $d_c + 1$ binary and one
$\lceil \log_2 d_c \rceil$-ary message per check node.
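
The message count quoted above can be read as follows: a min-sum check node only needs the two smallest input magnitudes, the sign of each of its $d_c$ inputs plus their overall parity, and the position of the smallest input. The Python sketch below illustrates this reading; the function names, the optional `scale` argument and the treatment of a zero input as positive are assumptions made for the example and are not taken from the decoder described in this paper.

```python
import math


def compress_check_node(incoming):
    """Summarize the d_c incoming messages of one check node."""
    magnitudes = [abs(v) for v in incoming]
    idx_min = min(range(len(incoming)), key=magnitudes.__getitem__)
    min1 = magnitudes[idx_min]                       # smallest magnitude
    min2 = min(m for k, m in enumerate(magnitudes) if k != idx_min)
    signs = [v < 0 for v in incoming]                # d_c binary values
    parity = sum(signs) % 2 == 1                     # one more binary value
    return min1, min2, idx_min, signs, parity


def outgoing_message(summary, k, scale=1.0):
    """Reconstruct the min-sum message sent back along edge k."""
    min1, min2, idx_min, signs, parity = summary
    magnitude = min2 if k == idx_min else min1
    negative = parity ^ signs[k]     # sign product over all other edges
    return -scale * magnitude if negative else scale * magnitude


def index_bits(dc):
    """Number of bits needed to store the position of the minimum."""
    return math.ceil(math.log2(dc))
```

For example, `outgoing_message(compress_check_node([-3, 2, 5, -1]), 0)` reproduces the message obtained by taking the minimum magnitude and sign product of the three other inputs directly.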

SC-LDPC codes were introduced more than a decade ago,
but their outstanding properties have only been fully
realized recently, when Lentmaier et al. noticed that
the estimated decoding performance of a certain class of
terminated protograph-based SC-LDPC codes with a simple
message passing decoder is close to the performance of
the underlying code ensemble under Maximum Likelihood
(ML) decoding as $n$ grows, which was subsequently proven
rigorously if certain particular conditions on the
code structure are fulfilled.

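A left-terminated SC-LDPC code is basically an LDPC code
with a structured, infinitely extended parity-check matrix

$$
H_{\text{SC}} =
\begin{pmatrix}
H_0(0) & & & & \\
H_1(0) & H_0(1) & & & \\
\vdots & H_1(1) & \ddots & & \\
H_\mu(0) & \vdots & \ddots & H_0(t) & \\
 & H_\mu(1) & & H_1(t) & \ddots \\
 & & \ddots & \vdots & \\
 & & & H_\mu(t) & \\
 & & & & \ddots
\end{pmatrix}
\qquad (1)
$$

with $H_i(t)$ being sparse binary parity-check matrices with
$\dim H_i(t) = m \times n$ and $\mu$ denoting the syndrome former
memory of the code. Every code word $x$ of the code has
to fulfill $H_{\text{SC}} x^T = 0$. One advantage of SC-LDPC codes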

is that the infinitely long code words can conveniently be
decoded with acceptable latency using a simple windowed
decoder. In practice, in order to construct codes of
finite length, e.g., to adhere to certain framing structures in
the communication system at hand, the infinitely extended
matrix $H_{\text{SC}}$ is terminated, resulting in a finite-length code. One
example of termination is zero-termination, where the matrix
$H_{\text{SC}}$ is cut off after $t = L$ parts, resulting in a code of
length $N = Ln$ and a parity-check matrix of size
$\dim H = (L + \mu)m \times Ln$. Note that this termination
leads to a rate loss, which can however be kept small if $L$ is
chosen large enough. For a discussion of termination schemes,
we refer the interested reader to the literature.

Originally, these codes were called LDPC convolutional codes. The term
"spatially coupled" has been introduced to denote the more general
phenomenon of coupling several independent code(word)s by a superimposed,
convolutional-like structure.
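
The zero-termination described above can be sketched in a few lines of Python. The sketch assumes the time-invariant case (the same sub-matrices at every position), and the function names are hypothetical. The design rate $1 - (L+\mu)m/(Ln)$ returned by `terminated_design_rate` additionally assumes a full-rank parity-check matrix.

```python
from fractions import Fraction


def zero_terminated_matrix(sub_matrices, L):
    """Build the (L + mu) m x L n matrix from [H_0, ..., H_mu]."""
    mu = len(sub_matrices) - 1
    m, n = len(sub_matrices[0]), len(sub_matrices[0][0])
    H = [[0] * (L * n) for _ in range((L + mu) * m)]
    for t in range(L):                       # column block (spatial position)
        for i, H_i in enumerate(sub_matrices):
            for r in range(m):
                for c in range(n):
                    # H_i sits in row block t + i of column block t
                    H[(t + i) * m + r][t * n + c] = H_i[r][c]
    return H


def terminated_design_rate(m, n, mu, L):
    """Design rate of the zero-terminated code (full rank assumed)."""
    return 1 - Fraction((L + mu) * m, L * n)


# Tiny example with m = 1, n = 2, mu = 1, L = 3 (toy sub-matrices).
H_small = zero_terminated_matrix([[[1, 1]], [[1, 0]]], L=3)

# Sub-matrix size, memory and termination length of SC-LDPC Code A,
# as given in Sec. III-A of this paper.
rate_code_a = terminated_design_rate(m=1500, n=7500, mu=2, L=90)
```

With these parameters the design rate evaluates to 179/225, i.e., slightly below the rate 4/5 of the unterminated code, which illustrates why the rate loss stays small for large $L$.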

SC codes are now emerging in various applications. Two
examples of SC product codes are the staircase code and
the braided BCH codes for hard-decision decoding in
optical communications. SC-LDPC codes may also be viable
for pragmatic coded modulation schemes.

In order to simplify the design of hardware, we first drop
the time dependency and only consider the time-independent
(left-terminated) parity-check matrix with $H_i(t) = H_i$, $\forall i \in
\{0, 1, \ldots, \mu\}$, which is attractive for implementation as the
sub-matrices $H_i$ can be easily reused in the encoder and
decoder hardware. In this time-invariant construction with
$\dim H_i = m \times n$, we can give the following upper bound
on the minimum distance of the code [Eq. (7) of the cited work]

$$d_{\min} \leq (m + 1)(\mu m + 1). \qquad (2)$$

To construct codes with large enough minimum distances, we
maximize the size of the sub-matrices $H_i$, i.e., $m$, which has a
quadratic influence on (2). In order to keep the complexity of
the so-constructed code small, we restrict ourselves to small
values of the syndrome former memory $\mu$ with either $\mu = 1$
or $\mu = 2$. We call such codes weakly coupled codes.

III. Rapidly Converging SC-LDPC Codes 

In the past, irregular block LDPC codes have been used
to design codes that perform very well for low SNRs, but
these schemes sometimes suffer from relatively high error
floors, requiring the use of an outer code that leads to inherent
rate losses. In the case of SC-LDPC codes, we can use the
irregularity to control the propagation speed of the decoding
wave of a windowed decoder, i.e., we can minimize the
number of iterations $I_{\text{req}}$ that are necessary until a windowed
decoder can advance by one step. To simplify the code
construction and to illustrate the concept, we only use the
simplest form of irregularity and construct slightly irregular
SC-LDPC codes with degree-3 and additionally with either
degree-4 or degree-6 variable nodes. We avoid degree-2 variable
nodes due to their potentially detrimental effect on the error
floor. Also, in contrast to block LDPC codes, degree-2 nodes
are not of the same importance for SC-LDPC codes. We vary
the fraction of degree-4 or degree-6 nodes between 0 and 1 and
select the check nodes such that a rate $r = 4/5$ (25% overhead)
code is constructed. We perform full density evolution using
the irregular version of Kudekar's $(d_v, d_c, w, L)$ ensemble
for random spatial coupling with $w = 3$ using an AWGN

Fig. 1. Required $E_b/N_0$ to operate a windowed decoder with $I_{\text{req}}$ iterations
per segment for slightly irregular weakly coupled SC codes with degree-3
nodes and additionally degree-4 (dashed lines) or degree-6 (solid lines)
variable nodes (results obtained by density evolution).

channel and measure the required $E_b/N_0$ values to advance
the decoding wave by $I_{\text{req}}$ steps.
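
To give a feeling for how density evolution behaves on a coupled chain, the following Python sketch runs the standard erasure-channel recursion for a regular $(d_v, d_c, w, L)$ ensemble. This is a deliberate simplification: the results in this paper use an AWGN channel and an irregular ensemble, whereas the sketch uses a Binary Erasure Channel and regular degrees. The function name and the stopping parameters `max_iters` and `tol` are hypothetical.

```python
def coupled_de_bec(eps, dv, dc, w, L, max_iters=5000, tol=1e-10):
    """Return the number of iterations until every erasure probability
    along the chain is below tol, or None if that never happens."""
    x = [eps] * L                     # x[i]: erasure prob. at position i

    def at(vec, i):
        # Positions outside the chain are perfectly known (termination).
        return vec[i] if 0 <= i < len(vec) else 0.0

    for iteration in range(1, max_iters + 1):
        new_x = []
        for i in range(L):
            outer = 0.0
            for j in range(w):        # checks connected to position i
                inner = sum(1.0 - at(x, i + j - k) for k in range(w)) / w
                outer += 1.0 - inner ** (dc - 1)
            new_x.append(eps * (outer / w) ** (dv - 1))
        x = new_x
        if max(x) < tol:
            return iteration
    return None


# Rate-1/2 textbook example: coupling (w = 3) versus no coupling (w = 1).
iters_coupled = coupled_de_bec(0.45, dv=3, dc=6, w=3, L=20)
iters_uncoupled = coupled_de_bec(0.45, dv=3, dc=6, w=1, L=20)
```

At an erasure probability of 0.45 the coupled chain decodes, starting from its two ends, while the uncoupled code gets stuck at a non-zero fixed point. The same function can be called with other degrees, e.g., `coupled_de_bec(0.18, 4, 20, w=3, L=20)` for a regular $d_v = 4$, $d_c = 20$ ensemble; the returned iteration count gives a rough indication of how fast the decoding wave travels.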

The density evolution results are shown in Fig. 1 for varying
$a_{v,4}$ and $a_{v,6}$. We can see that additionally using degree-4
(besides degree-3) variable nodes does not lead to noteworthy
gains, which is why we focus on additional degree-6 nodes
in this paper. The convergence speed improves by selecting a
proper value of $a_{v,6}$, leading to a smaller required $E_b/N_0$. The
selection depends however on $I_{\text{req}}$. As we intend to construct
low complexity decoders with a small number of iterations
$I_{\text{req}}$, we can see that in this case, the optimum is achieved
with $a_{v,6} = 0.2$
(20% of degree-6 variable nodes, 80% of degree-3 variable
nodes). We can see that by proper selection of $a_{v,6}$, we can
obtain codes that have an improved decoding convergence;
however, we also see that depending on the selection, a worse
convergence behavior than for the regular case can result. We
also observe that if we want a code that operates extremely
close to capacity, the optimum value of $a_{v,6}$ is larger (around
0.5) than for the more practical case, where the optimum lies at
$a_{v,6} = 0.2$. Note that although we use Kudekar's ensemble for
density evolution, the codes we construct in the next section
are generated from protographs, as
these exhibit better finite length performance.

A. FPGA-based Verification 

In order to verify the performance of the rapidly
converging weakly coupled SC-LDPC codes, we use a
Field-Programmable Gate Array (FPGA) platform, whose high-level
diagram is illustrated in Fig. 2. This platform is similar
to other platforms reported in the literature and consists
of three parts: a Gaussian noise generator, an FEC decoder
and an error detecting circuit. The Gaussian noise generator
generates Gaussian distributed Log-Likelihood Ratios (LLRs),
stemming from BPSK transmission over an AWGN channel,
using uniform random number generators and the Box-Muller
transform. These are then fed to the LDPC decoder after





Fig. 2. High-level schematic of the FPGA evaluation platform. 


quantization to 15 levels. The LDPC decoder is based on
the layered decoding algorithm and uses a scaled min-sum
check node computation rule with constant scaling factor.
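
The noise generation and quantization steps can be sketched in software as follows. This is only an illustration of the principle and not the FPGA implementation: it assumes that the all-zero code word is transmitted (BPSK symbol +1), uses a uniform quantizer whose step size `step` is a free, hypothetical parameter, and all function names are chosen for this example.

```python
import math
import random


def box_muller(rng):
    """One standard Gaussian sample from two uniform samples."""
    u1 = 1.0 - rng.random()           # in (0, 1], avoids log(0)
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def noise_variance(ebn0_db, rate):
    """Noise variance for unit-energy BPSK at a given Eb/N0 and code rate."""
    return 1.0 / (2.0 * rate * 10.0 ** (ebn0_db / 10.0))


def channel_llr(rng, sigma2):
    """LLR of a BPSK symbol +1 received over an AWGN channel."""
    y = 1.0 + math.sqrt(sigma2) * box_muller(rng)
    return 2.0 * y / sigma2


def quantize(llr, step, levels=15):
    """Uniform symmetric quantizer with an odd number of levels (-7 ... 7)."""
    half = (levels - 1) // 2
    return max(-half, min(half, round(llr / step)))
```

The LLRs generated this way are Gaussian with mean $2/\sigma^2$ and variance $4/\sigma^2$, so a single parameter fully describes the channel.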

The windowed decoder that is implemented can be
subdivided into three steps. In the first step, a new sub-block
of $n$ quantized LLRs is received from the random number
generator and put into the vacant position of the decoder's LLR
memory. Decoding takes place by considering $W = 13$ copies
of $(H_\mu \; H_{\mu-1} \; \cdots \; H_0)$. The windowed decoder considers
an equivalent matrix of size $Wm \times (W + \mu - 1)n$, which
it processes before shifting in $n$ new values. In order to
maximize the hardware utilization, within a window, we use
two parallel decoders that operate on non-overlapping portions
of that matrix. In a first step, the first decoding engine operates
on the first $m$ check nodes of the matrix under consideration
while the second engine operates in parallel on the $m$ check
nodes starting at position $6m$. In general, the first engine
processes the check nodes at position $i \in [1, Wm]$ while
the second engine processes the check node
$((i + 6m - 1) \bmod Wm) + 1$. Note that only a single iteration is carried out
to guarantee the required throughput, corresponding effectively
to $2W$ iterations per bit (due to the use of two engines).
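
The index rule for the two decoding engines can be made concrete with a short Python sketch. The generator name and the parameter `offset_blocks` (the factor 6 in the offset $6m$) are hypothetical names for this example, and the tiny value of $m$ used below is only for illustration.

```python
def engine_schedule(W, m, offset_blocks=6):
    """Yield pairs (check of engine 1, check of engine 2), 1-indexed."""
    total = W * m                     # number of check nodes in the window
    for i in range(1, total + 1):
        yield i, (i + offset_blocks * m - 1) % total + 1


# Window size from the text, with a toy sub-matrix height m = 4.
pairs = list(engine_schedule(W=13, m=4))
```

Over one pass, each engine visits every check node of the window exactly once, and the two engines never work on the same check node at the same time because they stay $6m$ positions apart.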

The output of the LDPC decoder is connected to the
BER evaluation unit, which counts the bit errors and reports
the error positions. We use Virtex-7 FPGAs allowing for
a throughput of several Gbit/s to evaluate the BER
performance of several coding schemes of rate $r = 4/5 = 0.8$,
i.e., of 25% coding overhead. We select this particular rate
due to its importance in today's Dense Wavelength Division
Multiplex (DWDM) systems. Current and future 100 Gbit/s
(with QPSK) or 200 Gbit/s (with 16-QAM) systems are often
operated in 50 GHz channels with an exploitable bandwidth
of roughly 37.5 GHz due to Reconfigurable Optical Add-Drop
Multiplexers (ROADMs) with non-flat frequency characteristics.
With almost rectangular pulse shapes (root-raised cosine with
small roll-off $\alpha$) and today's generation of Digital-to-Analog
Converters (DACs), symbol rates of 32 GBaud can be realized.
With dual-polarization QPSK transmission, gross bit rates of
128 Gbit/s can be realized. Assuming signaling and protocol
overheads of 3 Gbit/s, this leads to a code that adds 25 Gbit/s
parity overhead to a 100 Gbit/s payload (i.e., of rate $r = 0.8$).
We compare three codes:

▽ As reference, we consider a regular block QC-LDPC code
(marker ▽) with variable node degree $d_v = 3$ and check
node degree $d_c = 15$. The code is a quasi-cyclic code
of girth 10 and block length $N = 31{,}200$, constructed
using cyclically shifted identity matrices of size $32 \times 32$
and decoded with $I = 26$ row-layered iterations.

□ SC-LDPC Code A (□) is the rapidly converging irregular
code with syndrome former memory $\mu = 2$, $a_{v,6} = 0.2$
and $a_{v,3} = 0.8$ and check node degree $d_c = 18$. The
sub-block size is $n = 7500$ ($\dim H_i = 1500 \times 7500$).

◇ SC-LDPC Code B (◇) is a regular $d_v = 4$ code with
$d_c = 20$ and syndrome former memory $\mu = 1$. The size
of the sub-matrices is identical to those of SC-LDPC code
A, however, we select $\mu = 1$.
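
As a quick consistency check of the three parameter sets, the design rate $r = 1 - \bar{d}_v / d_c$ (with $\bar{d}_v$ the average variable node degree) can be evaluated exactly. The sketch below is a worked example with a hypothetical helper `design_rate`; for Code B it assumes $d_v = 4$, the value consistent with $d_c = 20$ and rate 0.8, and it ignores the termination rate loss of the SC codes.

```python
from fractions import Fraction


def design_rate(variable_profile, dc):
    """Design rate 1 - (average variable degree) / (check degree).

    variable_profile maps a variable node degree to the fraction of
    variable nodes that have this degree.
    """
    avg_dv = sum(Fraction(frac) * d for d, frac in variable_profile.items())
    return 1 - avg_dv / dc


rate_block = design_rate({3: 1}, dc=15)
rate_code_a = design_rate({3: Fraction(4, 5), 6: Fraction(1, 5)}, dc=18)
rate_code_b = design_rate({4: 1}, dc=20)
```

All three evaluate to 4/5, so the comparison is carried out at equal design rate.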

Both SC codes are constructed from cyclic permutation
matrices of size $30 \times 30$ and are terminated after $L = 90$ sub-blocks.
The simulation results are shown in Fig. 3. The block code,
which has a matrix that has been optimized for low error
floors, is outperformed by both SC-LDPC codes. SC-LDPC
code A offers a coding gain of around 0.3 dB at a BER of 
10“^^ compared to the conventional block LDPC code, but an 
error floor starts to manifest. This error floor is not due to any 
trapping sets, but due to a few uncorrected bits after windowed 
decoding, which can be recovered with a few-error correcting 
outer code. Code B has a BER curve that starts to decay at 
worse channels, but the BER curves cross at « 10“^^. Eor 
the next simulated point, we did not observe any bit errors,
and hence we conjecture a lower error floor than for
Code A. Note that no special measures have been taken to
combat an error floor; only a plain scaled min-sum decoder
has been used. With the block code, post-processing may
be necessary to combat the error floor.

Another advantage of SC-LDPC codes is that they are
future-proof: while the block code does not benefit from
further decoding iterations, as its performance is already close
to its decoding threshold, the scaling behavior of the
SC-LDPC code allows further iterations to be carried out to achieve
still larger coding gains, as the gap to the decoding threshold
is still non-negligible. This makes these codes attractive for
standardization.

IV. SC-LDPC Codes for Modulation and Detection 

As future optical networks tend to become increasingly 
flexible and elastic, transceivers that integrate a certain amount 
of flexibility with respect to coding and modulation formats 
are required. Especially the modulation format is expected 
to change when transceivers are designed for long-haul or 
short-haul applications, where the latter require high spectral 
efficiencies (e.g., data center interconnects). In this section, we 
show that SC codes are perfectly suited to be combined with
varying modulation formats due to their universality
properties. We combine SC-LDPC codes with a modulator and
use density evolution to show how the detector front-end
influences the performance of the codes. In conventional (block)



Fig. 3. Simulation results with FPGA-based windowed decoding, W = 13, 
two decoder instances. 

LDPC code design, usually the code needs to be "matched" to
the transfer curve of the detection front-end. If the code is
not well matched to the front-end, a performance loss occurs.
If the detector front-end has highly varying characteristics, due 
to, e.g., varying modulation formats or channels, several codes 
would need to be implemented and selected depending on the 
conditions, which is not feasible in optical networks, where 
feedback is usually difficult to realize and where different 
codes cannot be implemented due to hardware constraints. 

In contrast to many block LDPC codes, spatially coupled
LDPC codes can converge below the pinch-off in the EXIT
chart due to the effect of threshold saturation. Hence, even
if the code is not well matched to the demodulator/detector
from a traditional point of view, we can hope to successfully
decode. We can hence use a single code which is universally
good in all scenarios and the code design can stay agnostic to
the channel/detector behavior. In order to illustrate the concept,
we model the detector by a linear EXIT characteristic

$$I_E^{[D]} = f_D\big(I_A^{[D]}\big) = a \cdot I_A^{[D]} + I_c - \frac{a}{2}$$

where $a$ controls the slope of the characteristic and $I_c =
\int_0^1 f_D(\tau)\,d\tau$ describes the mutual information of the
communication channel. The slope $a$ models the effect of, e.g.,
different modulation formats, different bit labelings in higher
order modulation and different detectors. We assume that the
output of the detector can be modeled using a Binary Erasure
Channel (BEC). We therefore also use BEC message passing.
We compare two different code approaches: first we use

We have presented an example of such a system with
differential detection ($a \approx 0.2$) that is adapted to a channel
with varying phase noise in previous work. Therein, a single spatially
coupled code was able to outperform two different LDPC
codes optimized for different channel characteristics.
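
The linear detector model used in this section can be illustrated numerically. The sketch below assumes the form $f_D(I_A) = a \cdot I_A + I_c - a/2$, i.e., an offset chosen such that the area under the characteristic equals $I_c$, as required by the definition of $I_c$ as the integral of $f_D$ over $[0, 1]$. The function names and the example values $a = 0.2$ and $I_c = 0.5$ are chosen only for illustration.

```python
def detector_exit(i_a, a, i_c):
    """Linear detector characteristic f_D(I_A) = a * I_A + I_c - a / 2."""
    return a * i_a + i_c - a / 2.0


def area_under_exit(a, i_c, steps=1000):
    """Midpoint-rule approximation of the integral of f_D over [0, 1]."""
    h = 1.0 / steps
    return sum(detector_exit((k + 0.5) * h, a, i_c) for k in range(steps)) * h


area = area_under_exit(a=0.2, i_c=0.5)
```

Changing the slope `a` tilts the characteristic around its midpoint while leaving the area, and hence the channel quality $I_c$, unchanged; this is what allows the slope to be varied independently of the channel in the threshold comparison.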

V. Conclusions 

In this paper, we have highlighted Spatially Coupled (SC)-LDPC
codes as potential candidates for future lightwave
transmission systems. We have optimized SC-LDPC codes for
convergence speed and shown by means of an FPGA-based
simulation that very low error rates can be obtained. Finally,
we have shown that SC-LDPC codes can be good candidates if they
are employed in a system with iterative decoding and detection:
a single code can be used in various channel conditions.


Fig. 4. Decoding thresholds of different SC-LDPC codes for varying detector 
characteristics with varying slope a and a regular (3, 6) block code. 


the spatially coupled $(d_v, d_c, w, L)$ ensemble
with the density evolution equation for iterative detection,
where $L(\xi) = \sum_i L_i \xi^i$ denotes the node-perspective
degree distribution polynomial, $\lambda(\xi) = L'(\xi)/L'(1)$ the
edge-perspective degree distribution, and $x_i^{(\ell)}$ the edge message
erasure probability of spatial position $i$ at iteration $\ell$.
Additionally, we generate protograph-based codes and employ
Multi-Edge-Type (MET) density evolution including iterative
detection. We consider two code families of rate 0.8: The
first family is the rapidly converging code from Sec. III with
$L(x) = \frac{4}{5}x^3 + \frac{1}{5}x^6$ and $d_c = 18$ where we use $w = 3$
and $L = 100$ in Kudekar's ensemble and $B_0 = B_1 =
B_2 = (2\;1\;1\;1\;1)$ with $\mu = 3$ in the protograph
ensemble. The second code is a regular code where we use
Kudekar's $(d_v = 4, d_c = 20, w = 3, L = 100)$ ensemble
and a protograph ensemble with $B_0 = (1\;2\;1\;2\;1)$ and
$B_1 = (3\;2\;3\;2\;3)$ with $\mu = 2$.

Figure 4 shows the DE results where we use solid lines
to show the decoding thresholds for Kudekar's ensemble
and dashed lines for the protograph-based
ensemble. All SC codes have decoding thresholds close to the
theoretical limit of $I_{c,\max} = 0.2$ and the decoding threshold is
almost independent of the detector characteristic's slope $a$. A
regular block LDPC code has a highly varying threshold for
different slopes $a$. The flat threshold behavior for SC-LDPC
codes indicates a universal, channel-agnostic behavior. Even
an optimized irregular LDPC code will only be good for a
single slope parameter. In order to improve the decoding
threshold, we may deliberately select a precoder that has an
EXIT characteristic with slope $a > 0$; however, as the inset of
Fig. 4 shows, the slope affects the decoding speed (measured at
$I_c = 0.185$), i.e., the number of iterations required to advance
the decoding wave by one step, so that the complexity will
grow alongside. For the case of the rapidly converging code,
$a > 0$ further increases the decoding speed.